Arguably the deepest fact known about the von Neumann entropy, the strong subadditivity
inequality is a potent hammer in the quantum information theorist's toolkit.
This short tutorial describes a simple proof of strong subadditivity due to Petz [Rep.
on Math. Phys. 23 (1), 57-65 (1986)]. It assumes only knowledge of elementary linear
algebra and quantum mechanics.

1 Introduction

The von Neumann entropy of a density matrix $\rho$ is defined by $S(\rho) = -\mathrm{tr}(\rho \ln \rho)$.
Suppose $\rho_{ABC}$ is a density matrix for a system with three components, A, B and C. The strong
subadditivity inequality states that

$$S(\rho_{ABC}) + S(\rho_B) \le S(\rho_{AB}) + S(\rho_{BC}), \tag{1}$$

where notations like $\rho_B$ denote the appropriate reduced density matrices.

The strong subadditivity inequality appears quite mysterious at first sight. Some intuition
is gained by reexpressing strong subadditivity in terms of the conditional entropy
$S(A|B) = S(\rho_{AB}) - S(\rho_B)$. Classically, when the von Neumann entropy is replaced by the
Shannon entropy function, the conditional entropy has an interpretation as the average uncertainty
about the state of A, given knowledge of the state of B [?]. Although this interpretation is
more problematic in the quantum case (for one thing, the quantum conditional entropy
can be negative!), it can still be useful for developing intuition and suggesting results. In
particular, subtracting $S(\rho_B) + S(\rho_{BC})$ from both sides of Eq. (1), we see that strong
subadditivity may be recast in the equivalent form

$$S(A|BC) \le S(A|B). \tag{2}$$

That is, strong subadditivity expresses the intuition that our uncertainty about A when B
and C are known is not more than when only B is known. This intuition is perhaps best
viewed as a mnemonic, due to the problematic interpretation of the conditional entropy, but
may nonetheless be helpful.
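
As a numerical illustration of Eqs. (1) and (2), the following sketch samples a random three-qubit state and evaluates both forms of the inequality. The helper names (`random_density_matrix`, `entropy`, `reduced`) and the choice of three qubits are assumptions made for this example and do not come from the paper. The last lines also compute the conditional entropy of a maximally entangled pair, which is negative.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (illustrative helper, not from the paper)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def entropy(rho):
    """Von Neumann entropy S(rho) = -tr(rho ln rho), computed from eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # convention: 0 ln 0 = 0
    return float(-np.sum(evals * np.log(evals)))

def reduced(rho, dims, keep):
    """Reduced density matrix on the subsystems whose (sorted) indices are in `keep`."""
    t = rho.reshape(tuple(dims) + tuple(dims))
    # Trace out unwanted subsystems, highest index first so lower indices stay valid.
    for k in sorted(set(range(len(dims))) - set(keep), reverse=True):
        t = np.trace(t, axis1=k, axis2=k + t.ndim // 2)
    d = int(np.prod([dims[k] for k in keep]))
    return t.reshape(d, d)

rng = np.random.default_rng(0)
dims = (2, 2, 2)  # three qubits A, B, C
rho_abc = random_density_matrix(8, rng)

s_abc = entropy(rho_abc)
s_b = entropy(reduced(rho_abc, dims, [1]))
s_ab = entropy(reduced(rho_abc, dims, [0, 1]))
s_bc = entropy(reduced(rho_abc, dims, [1, 2]))

ssa_gap = (s_ab + s_bc) - (s_abc + s_b)    # Eq. (1) predicts this is >= 0
cond_gap = (s_ab - s_b) - (s_abc - s_bc)   # Eq. (2): S(A|B) - S(A|BC) >= 0
print(ssa_gap, cond_gap)

# Negative conditional entropy: a maximally entangled pair of qubits.
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho_bell = np.outer(bell, bell.conj())
cond_bell = entropy(rho_bell) - entropy(reduced(rho_bell, (2, 2), [1]))
print(cond_bell)  # S(A|B) = -ln 2 for this state, up to rounding
```

The two gaps coincide because Eq. (2) is just a rearrangement of Eq. (1).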

Strong subadditivity has many applications in quantum information theory (see, e.g., [?]).
Our purpose here is not to discuss these applications, but rather to provide an expository
account of a simple proof of strong subadditivity due to Petz (see also [?]). In so doing
we hope to help publicise this proof to a wider audience. The reader looking for a more
comprehensive account in a similar vein to the present paper should consult [?].

Our proof strategy is to show that strong subadditivity is implied by a related result, the
monotonicity of the relative entropy, and then to prove this monotonicity result. The relative
entropy between density matrices $\rho$ and $\sigma$ is defined as:

$$S(\rho\|\sigma) = \mathrm{tr}(\rho \ln \rho - \rho \ln \sigma). \tag{3}$$

Roughly speaking, the relative entropy is a measure of the distance between $\rho$ and $\sigma$. In
particular, it can be shown that $S(\rho\|\sigma) \ge 0$, with equality if and only if $\rho = \sigma$. Be warned,
however, that it is not symmetric in $\rho$ and $\sigma$, and $S(\rho\|\sigma)$ diverges unless the support of $\rho$ is
contained within the support of $\sigma$. Further background on the relative entropy may be found
in [?]. The monotonicity of the relative entropy is the property that discarding part of a
composite system AB can only decrease the relative entropy between two density matrices
$\rho_{AB}$ and $\sigma_{AB}$:

$$S(\rho_A\|\sigma_A) \le S(\rho_{AB}\|\sigma_{AB}). \tag{4}$$
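
The definition in Eq. (3) and the monotonicity property in Eq. (4) can be explored numerically. The sketch below is only an illustration: the helper names, the dimensions $d_A = 2$, $d_B = 3$, and the trick of mixing in a little of the maximally mixed state (so that the logarithms are well defined) are all assumptions of this example, not part of the paper.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random density matrix, mixed with I/dim so that it is safely invertible."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return 0.9 * rho / np.trace(rho).real + 0.1 * np.eye(dim) / dim

def logm_h(a):
    """Logarithm of a positive definite Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.conj().T

def rel_entropy(rho, sigma):
    """S(rho||sigma) = tr(rho ln rho - rho ln sigma), Eq. (3), for invertible inputs."""
    return float(np.trace(rho @ (logm_h(rho) - logm_h(sigma))).real)

def trace_out_b(m, da, db):
    """Partial trace over the second factor of a (da*db) x (da*db) matrix."""
    return np.trace(m.reshape(da, db, da, db), axis1=1, axis2=3)

rng = np.random.default_rng(1)
da, db = 2, 3
rho_ab = random_density_matrix(da * db, rng)
sigma_ab = random_density_matrix(da * db, rng)
rho_a = trace_out_b(rho_ab, da, db)
sigma_a = trace_out_b(sigma_ab, da, db)

full = rel_entropy(rho_ab, sigma_ab)
part = rel_entropy(rho_a, sigma_a)
print(part, full)                      # Eq. (4) predicts part <= full
print(rel_entropy(sigma_ab, rho_ab))   # generally differs from `full`: no symmetry
```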

To see that monotonicity of the relative entropy implies strong subadditivity, we reexpress
strong subadditivity in terms of the relative entropy, using the identity:

$$S(B|A) = \ln d_B - S\left(\rho_{AB} \,\Big\|\, \rho_A \otimes \frac{I}{d_B}\right), \tag{5}$$

where $d_B$ is the dimension of system B and $I$ is the identity on B. Proving this identity is a
straightforward application of the definitions. Using this identity (with the roles of A and B
interchanged) we may recast the conditional entropic form of strong subadditivity, Eq. (2), as
an equivalent inequality between relative entropies:

$$S\left(\rho_{AB} \,\Big\|\, \frac{I}{d_A} \otimes \rho_B\right) \le S\left(\rho_{ABC} \,\Big\|\, \frac{I}{d_A} \otimes \rho_{BC}\right). \tag{6}$$

This inequality follows from the monotonicity of the relative entropy, Eq. (4), applied to
discarding system C, and thus strong subadditivity also follows from the monotonicity of the
relative entropy.
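
The identity in Eq. (5) is easy to test numerically. The following sketch compares $S(B|A)$ computed directly from entropies with the right-hand side of Eq. (5); the helper names and the dimensions $d_A = 3$, $d_B = 2$ are assumptions chosen for illustration.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random density matrix, mixed with I/dim so that it is safely invertible."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return 0.9 * rho / np.trace(rho).real + 0.1 * np.eye(dim) / dim

def logm_h(a):
    """Logarithm of a positive definite Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.conj().T

def entropy(rho):
    """S(rho) = -tr(rho ln rho) for an invertible density matrix."""
    return float(-np.trace(rho @ logm_h(rho)).real)

def rel_entropy(rho, sigma):
    """S(rho||sigma) = tr(rho ln rho - rho ln sigma)."""
    return float(np.trace(rho @ (logm_h(rho) - logm_h(sigma))).real)

rng = np.random.default_rng(2)
da, db = 3, 2
rho_ab = random_density_matrix(da * db, rng)
rho_a = np.trace(rho_ab.reshape(da, db, da, db), axis1=1, axis2=3)

cond_entropy = entropy(rho_ab) - entropy(rho_a)              # S(B|A) from its definition
reference = np.kron(rho_a, np.eye(db) / db)                  # rho_A tensor I/d_B
via_identity = np.log(db) - rel_entropy(rho_ab, reference)   # right-hand side of Eq. (5)
print(cond_entropy, via_identity)
```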

Strong subadditivity and the monotonicity of the relative entropy have an interesting and
lengthy history, and we will merely note a few highlights. The reader interested in a more
thorough account should see, e.g., the discussion in [?] and the end notes to Chapter 11
of [?].

The original proof of strong subadditivity was by Lieb and Ruskai [?], based on the beautiful
concavity results of Lieb [?]. Ruskai [?] has recently given an elegant exposition along
the lines of this original proof. Monotonicity of the relative entropy was actually proved
after strong subadditivity, by Lindblad [?] (see also [?]). As already noted, our approach to
strong subadditivity and monotonicity is due to Petz [?]. Independently of Petz, Narnhofer
and Thirring developed a related approach, based on similar broad ideas, but differing
substantially in the details.

2 Background on operator convex functions

The only background required for our proofs is a few simple facts from the theory of operator
convex functions. The reader is referred to Chapter 5 of [?] for an introduction to this beautiful
theory.

Suppose $I \subseteq \mathbb{R}$ is an interval in the real line, and $f : I \to \mathbb{R}$ is a real-valued function
on $I$. We will define a corresponding map $f : M_n \to M_n$, where $M_n$ is the space of $n \times n$
Hermitian matrices whose spectra lie in $I$. To define such a map, suppose $D$ is an $n \times n$ diagonal
matrix with real diagonal entries $d_1, \ldots, d_n \in I$. We define $f(D)$ to be the $n \times n$ diagonal matrix
with diagonal entries $f(d_1), \ldots, f(d_n)$. Generalizing this definition, if $X$ is any element of $M_n$
then we can write $X = U D U^\dagger$ for some unitary $U$ and diagonal matrix $D$. We define the
induced map $f : M_n \to M_n$ by $f(U D U^\dagger) = U f(D) U^\dagger$. More informally, we work in a basis
in which $X$ is diagonal, and apply $f$ to each of the diagonal entries. In cases where $X$ can be
decomposed in many different ways as $X = U D U^\dagger$ it is an easy exercise to show that $f(X)$
does not depend upon the decomposition chosen.
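
The construction $f(U D U^\dagger) = U f(D) U^\dagger$ translates directly into code. The sketch below uses a hypothetical helper `apply_fn` (a name introduced here, not in the paper) built on NumPy's Hermitian eigensolver, and also illustrates the independence from the chosen decomposition for a matrix with a degenerate eigenvalue.

```python
import numpy as np

def apply_fn(f, x):
    """Return f(X) = U f(D) U^dagger for a Hermitian matrix X = U D U^dagger."""
    d, u = np.linalg.eigh(x)          # columns of u are orthonormal eigenvectors
    return (u * f(d)) @ u.conj().T    # scales column j of u by f(d_j)

x = np.array([[2.0, 1.0], [1.0, 2.0]])   # eigenvalues 1 and 3
log_x = apply_fn(np.log, x)
sqrt_x = apply_fn(np.sqrt, x)
exp_undoes_log = np.allclose(apply_fn(np.exp, log_x), x)
sqrt_squares_back = np.allclose(sqrt_x @ sqrt_x, x)

# Degenerate spectrum: two different decompositions of the same X = diag(2, 2, 5).
c, s = np.cos(0.7), np.sin(0.7)
rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])  # mixes the degenerate pair
d = np.diag([2.0, 2.0, 5.0])
log_d = np.diag(np.log([2.0, 2.0, 5.0]))
same_x = np.allclose(rot @ d @ rot.T, d)           # rot D rot^T is the same matrix X
same_f = np.allclose(rot @ log_d @ rot.T, log_d)   # and it yields the same f(X)
print(exp_undoes_log, sqrt_squares_back, same_x, same_f)
```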

To define operator convexity, we first introduce a partial order on Hermitian matrices.
Given Hermitian matrices $X, Y \in M_n$ we define $X \le Y$ if $Y - X$ is a positive matrix. We say
a function $f : I \to \mathbb{R}$ is operator convex if for all $n$, for all $X, Y \in M_n$, and for all $p \in [0, 1]$
we have $f(pX + (1 - p)Y) \le p f(X) + (1 - p) f(Y)$.

Our later proofs use two simple lemmas about operator convexity, which we state at the end 
of this paragraph. We defer proofs of these lemmas until after the proof of the monotonicity 
of relative entropy, so as to not obscure the simplicity of the ideas used in that proof. 

Lemma 1: The function $f(x) = -\ln(x)$ is operator convex.

Lemma 2: If $f$ is operator convex, and $U : V \to W$ is an isometry (where $\dim(V) \le \dim(W)$),
then $f(U^\dagger X U) \le U^\dagger f(X) U$ for all $X$.
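
Both lemmas can be probed numerically before they are proved. The sketch below is an illustration under assumptions of our own choosing (random $4 \times 4$ positive matrices, $p = 0.3$, an isometry taken from the first two columns of a random unitary, and hypothetical helper names). It also includes a standard $2 \times 2$ example showing that $t \mapsto t^3$, although convex as a real function on $[0, \infty)$, fails to be operator convex, so operator convexity is strictly stronger than ordinary convexity.

```python
import numpy as np

def apply_fn(f, x):
    """f(X) = U f(D) U^dagger for Hermitian X."""
    d, u = np.linalg.eigh(x)
    return (u * f(d)) @ u.conj().T

def random_positive(n, rng):
    """Random strictly positive Hermitian matrix (illustrative helper)."""
    g = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return g @ g.conj().T + 0.1 * np.eye(n)

def min_eig(h):
    """Smallest eigenvalue of the Hermitian part of h; >= 0 means h is positive."""
    return float(np.linalg.eigvalsh((h + h.conj().T) / 2)[0])

def neg_log(t):
    return -np.log(t)

rng = np.random.default_rng(3)
x, y, p = random_positive(4, rng), random_positive(4, rng), 0.3

# Lemma 1: p f(X) + (1-p) f(Y) - f(pX + (1-p)Y) should be positive for f = -ln.
gap1 = (p * apply_fn(neg_log, x) + (1 - p) * apply_fn(neg_log, y)
        - apply_fn(neg_log, p * x + (1 - p) * y))

# Lemma 2: an isometry U from a 2-dimensional V into a 4-dimensional W.
q, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))
u = q[:, :2]                                   # U^dagger U = I on V
gap2 = u.conj().T @ apply_fn(neg_log, x) @ u - apply_fn(neg_log, u.conj().T @ x @ u)

# Contrast: t -> t^3 is convex on [0, inf) but not operator convex.
a = np.array([[1.0, 1.0], [1.0, 1.0]])
b = np.array([[3.0, 1.0], [1.0, 1.0]])
cube = lambda m: m @ m @ m
gap3 = (cube(a) + cube(b)) / 2 - cube((a + b) / 2)   # equals [[6, 1], [1, 0]]

print(min_eig(gap1), min_eig(gap2), min_eig(gap3))
```

The first two minimum eigenvalues are non-negative up to rounding, as the lemmas predict, while the third is negative because $\begin{pmatrix} 6 & 1 \\ 1 & 0 \end{pmatrix}$ has determinant $-1$.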

3 Proof of the monotonicity of the relative entropy 

To appreciate the ideas used in proving monotonicity, it is helpful to look at the proof of
the analogous classical result. This states that for probability distributions $r_{jk}$ and $s_{jk}$ in
two variables we have $\sum_j r_j (\ln r_j - \ln s_j) \le \sum_{jk} r_{jk} (\ln r_{jk} - \ln s_{jk})$, where
$r_j = \sum_k r_{jk}$ and $s_j = \sum_k s_{jk}$ are the marginal probability distributions. This is easily seen
to be equivalent to the inequality $\sum_{jk} r_{jk} \ln \frac{r_j s_{jk}}{s_j r_{jk}} \le 0$, which may be proved by applying
the calculus result $\ln x \le x - 1$ to the left-hand side, and showing that the resulting expression
vanishes.
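
The classical argument can be traced step by step in a few lines. In the sketch below the $3 \times 4$ shape of the joint distributions and the variable names are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
r = rng.random((3, 4))
r /= r.sum()                 # joint distribution r_jk
s = rng.random((3, 4))
s /= s.sum()                 # joint distribution s_jk
r_marg, s_marg = r.sum(axis=1), s.sum(axis=1)   # marginals r_j and s_j

lhs = np.sum(r_marg * (np.log(r_marg) - np.log(s_marg)))   # marginal relative entropy
rhs = np.sum(r * (np.log(r) - np.log(s)))                   # joint relative entropy

# Equivalent form: sum_jk r_jk ln( r_j s_jk / (s_j r_jk) ) <= 0.
ratio = (r_marg[:, None] * s) / (s_marg[:, None] * r)
combined = np.sum(r * np.log(ratio))    # equals lhs - rhs

# Applying ln x <= x - 1 term by term gives an upper bound that vanishes identically.
upper = np.sum(r * (ratio - 1))
print(lhs, rhs, combined, upper)
```

The bound vanishes because $\sum_{jk} r_j s_{jk} / s_j = \sum_j r_j = 1 = \sum_{jk} r_{jk}$.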

The difficulty in the quantum case is that the density matrices involved may not commute,
and this prevents them from being combined in a single logarithm. To overcome this difficulty
we reexpress the relative entropy $S(\rho\|\sigma)$ using a linear map on matrices known as the relative
modular operator. In defining this operator we will assume that $\rho$ and $\sigma$ are invertible; as
a result, our proof of monotonicity of the relative entropy and of strong subadditivity only
applies directly for invertible density matrices. The general results follow via a straightforward
continuity argument, which we omit.

To define the relative modular operator, we fix $\rho$ and $\sigma$ and define linear maps on matrices
$\mathcal{L}$ and $\mathcal{R}$ by $\mathcal{L}(X) = \sigma X$ and $\mathcal{R}(X) = X \rho^{-1}$, i.e., left multiplication by $\sigma$, and right
multiplication by $\rho^{-1}$. The relative modular operator is defined to be the product of these
linear maps under composition, $\Delta = \mathcal{L}\mathcal{R}$. Note that $\mathcal{L}$ and $\mathcal{R}$ commute, so we could equally
well have written $\Delta = \mathcal{R}\mathcal{L}$.

Footnote (to Lemma 2): We will follow the physicists' convention in often expecting the reader to
work out from context the domain and range of mappings. Thus, in this example $X$ is a Hermitian
matrix on the space $W$, with a spectrum lying within $I$, the domain of $f$.

Our next step is to define a function $\ln$ on linear maps on matrices, i.e., to define $\ln(\mathcal{E})$,
where $\mathcal{E}$ is a linear map on matrices that is strictly positive with respect to the Hilbert-Schmidt
inner product $\langle X, Y \rangle = \mathrm{tr}(X^\dagger Y)$. To do this we follow the same approach as described earlier
in the section on operator convex functions, expanding $\mathcal{E}$ in a diagonal basis as
$\mathcal{E} = \sum_j \lambda_j E_j$ and defining $\ln(\mathcal{E}) = \sum_j \ln(\lambda_j) E_j$.

With this definition, $\ln(\mathcal{L})$, $\ln(\mathcal{R})$, and $\ln(\Delta)$ are all defined, since $\mathcal{L}$, $\mathcal{R}$, and $\Delta$ are all
strictly positive with respect to the Hilbert-Schmidt inner product. To see that $\mathcal{L}$ is strictly
positive observe that $\langle X, \mathcal{L}(X) \rangle = \mathrm{tr}(X^\dagger \sigma X) > 0$ for all non-zero $X$. The proof that $\mathcal{R}$ is
strictly positive follows similar lines. Finally, since $\Delta$ is a product of strictly positive and
commuting linear maps on matrices, it follows that $\Delta$ is strictly positive.

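A little thought shows that $\ln(\mathcal{L})(X) = \ln(\sigma) X$ and $\ln(\mathcal{R})(X) = -X \ln(\rho)$. What is more,
since $\mathcal{L}$ and $\mathcal{R}$ commute, we obtain the beautiful relationship $\ln(\Delta) = \ln(\mathcal{L}) + \ln(\mathcal{R})$. Some
algebra shows that

$$S(\rho\|\sigma) = \langle \rho^{1/2}, -\ln(\Delta)(\rho^{1/2}) \rangle. \tag{7}$$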

That is, the relative modular operator has enabled us to combine the logarithms in the
definition of the relative entropy into a single logarithm, which greatly simplifies analysis.
Using Eq. (7) we may rewrite the monotonicity of the relative entropy in the equivalent form

$$\langle \rho_A^{1/2}, -\ln(\Delta_A)(\rho_A^{1/2}) \rangle \le \langle \rho_{AB}^{1/2}, -\ln(\Delta_{AB})(\rho_{AB}^{1/2}) \rangle, \tag{8}$$

where the first inner product $\langle \cdot, \cdot \rangle$ is on the space $M(A)$ of matrices acting on A, the
second inner product is on the space $M(AB)$ of matrices acting on AB, and
$\Delta_A(X) = \sigma_A X \rho_A^{-1}$, $\Delta_{AB}(X) = \sigma_{AB} X \rho_{AB}^{-1}$ are the natural relative modular operators on
systems A and AB, respectively.
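
Eq. (7) can be checked numerically by representing linear maps on matrices as ordinary matrices acting on flattened (vectorized) matrices. The sketch below is an illustration only: the vectorization convention, the helper names and the dimension $n = 3$ are assumptions of this example. With row-major flattening, the map $X \mapsto A X B$ is represented by the Kronecker product $A \otimes B^T$, and the Hilbert-Schmidt inner product becomes the usual inner product of vectors.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random density matrix, mixed with I/dim so that it is safely invertible."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return 0.9 * rho / np.trace(rho).real + 0.1 * np.eye(dim) / dim

def apply_fn(f, x):
    """f(X) = U f(D) U^dagger for Hermitian X."""
    d, u = np.linalg.eigh(x)
    return (u * f(d)) @ u.conj().T

rng = np.random.default_rng(5)
n = 3
rho, sigma = random_density_matrix(n, rng), random_density_matrix(n, rng)
eye = np.eye(n)

# With row-major flattening, vec(A X B) = (A kron B^T) vec(X).
left = np.kron(sigma, eye)                     # L(X) = sigma X
right = np.kron(eye, np.linalg.inv(rho).T)     # R(X) = X rho^{-1}
delta = left @ right                           # relative modular operator
commute = np.allclose(left @ right, right @ left)

log_delta = apply_fn(np.log, delta)
additive = np.allclose(log_delta, apply_fn(np.log, left) + apply_fn(np.log, right))

v = apply_fn(np.sqrt, rho).reshape(-1)         # vec(rho^{1/2})
eq7 = np.vdot(v, -log_delta @ v).real          # <rho^{1/2}, -ln(Delta)(rho^{1/2})>
direct = np.trace(rho @ (apply_fn(np.log, rho) - apply_fn(np.log, sigma))).real
print(eq7, direct)
```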

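The final step in the proof is to find a linear map on matrices $\mathcal{U} : M(A) \to M(AB)$
such that: (1) $\mathcal{U}^\dagger \Delta_{AB} \mathcal{U} = \Delta_A$; (2) $\mathcal{U}(\rho_A^{1/2}) = \rho_{AB}^{1/2}$; and (3) $\mathcal{U}$ is an isometry from
$M(A)$ to $M(AB)$. It is not obvious such a $\mathcal{U}$ ought to exist. We explicitly construct $\mathcal{U}$ below,
but for now we assume $\mathcal{U}$ exists, and investigate the consequences. Using property (1) in Eq. (8)
we rewrite the monotonicity of the relative entropy as:

$$\langle \rho_A^{1/2}, -\ln(\mathcal{U}^\dagger \Delta_{AB} \mathcal{U})(\rho_A^{1/2}) \rangle \le \langle \rho_{AB}^{1/2}, -\ln(\Delta_{AB})(\rho_{AB}^{1/2}) \rangle. \tag{9}$$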

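But by Lemma 1 and Lemma 2 on the properties of operator convex functions we have
$-\ln(\mathcal{U}^\dagger \Delta_{AB} \mathcal{U}) \le -\mathcal{U}^\dagger \ln(\Delta_{AB}) \mathcal{U}$, and so

\begin{align}
\langle \rho_A^{1/2}, -\ln(\mathcal{U}^\dagger \Delta_{AB} \mathcal{U})(\rho_A^{1/2}) \rangle
&\le \langle \rho_A^{1/2}, -\mathcal{U}^\dagger \ln(\Delta_{AB}) \mathcal{U}(\rho_A^{1/2}) \rangle \tag{10} \\
&= \langle \mathcal{U}(\rho_A^{1/2}), -\ln(\Delta_{AB}) \mathcal{U}(\rho_A^{1/2}) \rangle \tag{11} \\
&= \langle \rho_{AB}^{1/2}, -\ln(\Delta_{AB}) \rho_{AB}^{1/2} \rangle, \tag{12}
\end{align}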
which completes the proof of monotonicity, provided we can find a $\mathcal{U}$ satisfying properties
(1)-(3). Based on property (2) a plausible candidate is
$\mathcal{U}(X) = (X \rho_A^{-1/2} \otimes I_B) \rho_{AB}^{1/2}$. With this definition, it is not difficult to check that
$\mathcal{U}^\dagger(Y) = \mathrm{tr}_B(Y \rho_{AB}^{1/2} (\rho_A^{-1/2} \otimes I_B))$ is the corresponding adjoint operation, i.e., satisfies
$\langle \mathcal{U}^\dagger(Y), X \rangle = \langle Y, \mathcal{U}(X) \rangle$ for all $X \in M(A)$ and $Y \in M(AB)$.
Direct calculation now shows that $\mathcal{U}^\dagger \Delta_{AB} \mathcal{U} = \Delta_A$ and $\mathcal{U}^\dagger \mathcal{U} = I_A$, which completes the list
of desired properties, and the proof of monotonicity.
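
The "direct calculation" for the candidate map can be spot-checked numerically on random inputs. In the sketch below the function names (`u_map`, `u_adj`, `delta_a`, `delta_ab`, `tr_b`), the dimensions $d_A = 2$, $d_B = 3$ and the random test matrices are assumptions introduced for illustration; the formulas inside the functions are those stated in the text.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random density matrix, mixed with I/dim so that it is safely invertible."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return 0.9 * rho / np.trace(rho).real + 0.1 * np.eye(dim) / dim

def apply_fn(f, x):
    """f(X) = U f(D) U^dagger for Hermitian X."""
    d, u = np.linalg.eigh(x)
    return (u * f(d)) @ u.conj().T

rng = np.random.default_rng(6)
da, db = 2, 3

def tr_b(m):
    """Partial trace over system B."""
    return np.trace(m.reshape(da, db, da, db), axis1=1, axis2=3)

rho_ab = random_density_matrix(da * db, rng)
sigma_ab = random_density_matrix(da * db, rng)
rho_a, sigma_a = tr_b(rho_ab), tr_b(sigma_ab)
sqrt_ab, sqrt_a = apply_fn(np.sqrt, rho_ab), apply_fn(np.sqrt, rho_a)
inv_sqrt_a = np.linalg.inv(sqrt_a)
eye_b = np.eye(db)

def u_map(x):      # U(X) = (X rho_A^{-1/2} tensor I_B) rho_AB^{1/2}
    return np.kron(x @ inv_sqrt_a, eye_b) @ sqrt_ab

def u_adj(y):      # U^dagger(Y) = tr_B(Y rho_AB^{1/2} (rho_A^{-1/2} tensor I_B))
    return tr_b(y @ sqrt_ab @ np.kron(inv_sqrt_a, eye_b))

def delta_ab(y):   # Delta_AB(Y) = sigma_AB Y rho_AB^{-1}
    return sigma_ab @ y @ np.linalg.inv(rho_ab)

def delta_a(x):    # Delta_A(X) = sigma_A X rho_A^{-1}
    return sigma_a @ x @ np.linalg.inv(rho_a)

x = rng.normal(size=(da, da)) + 1j * rng.normal(size=(da, da))
y = rng.normal(size=(da * db, da * db)) + 1j * rng.normal(size=(da * db, da * db))

adjoint_ok = np.isclose(np.trace(u_adj(y).conj().T @ x), np.trace(y.conj().T @ u_map(x)))
prop1 = np.allclose(u_adj(delta_ab(u_map(x))), delta_a(x))   # U^dagger Delta_AB U = Delta_A
prop2 = np.allclose(u_map(sqrt_a), sqrt_ab)                  # U(rho_A^{1/2}) = rho_AB^{1/2}
prop3 = np.allclose(u_adj(u_map(x)), x)                      # U^dagger U = I_A
print(adjoint_ok, prop1, prop2, prop3)
```

Agreement on random inputs is of course only evidence, not a proof; the algebraic verification is the one indicated in the text.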

This proof of monotonicity highlights the role of the operator convexity of $f(x) = -\ln(x)$.
If $f$ is any operator convex function and we define an $f$-relative entropy by
$S_f(\rho\|\sigma) = \langle \rho^{1/2}, f(\Delta)(\rho^{1/2}) \rangle$, the same argument shows that we obtain an analogous
monotonicity property.

4 Proofs of the operator convexity lemmas 

To prove Lemma 1, we begin with a proof that $f(x) = 1/x$ is operator convex on $(0, \infty)$. A
key fact used in the proof is that if $X \le Y$, then $Z X Z^\dagger \le Z Y Z^\dagger$ for any choice of $Z$, i.e.,
conjugation preserves matrix inequalities. The proof of this useful fact is a good exercise in
applying the definition of $\le$.

To prove the operator convexity of $f(x) = 1/x$, let $X$ and $Y$ be strictly positive Hermitian
matrices. We begin with the special case $X = I$, where the goal is to prove
$(pI + (1 - p)Y)^{-1} \le pI + (1 - p)Y^{-1}$. Since $I$ and $Y$ commute, this result follows from the
ordinary convexity of the real function $1/x$.

To obtain the general operator convexity from the special case $X = I$, make the replacement
$Y \to X^{-1/2} Y X^{-1/2}$, which gives

$$(pI + (1 - p) X^{-1/2} Y X^{-1/2})^{-1} \le pI + (1 - p)(X^{-1/2} Y X^{-1/2})^{-1}. \tag{13}$$

Conjugating by $X^{-1/2}$ and doing a little algebra gives the desired inequality,
$(pX + (1 - p)Y)^{-1} \le p X^{-1} + (1 - p) Y^{-1}$, and concludes the proof that $f(x) = 1/x$ is
operator convex.
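
The two steps of this argument, Eq. (13) and the conjugation by $X^{-1/2}$, can be followed numerically. The sketch below uses random $4 \times 4$ positive matrices and $p = 0.3$; these choices and the helper names are assumptions made for illustration.

```python
import numpy as np

def random_positive(n, rng):
    """Random strictly positive Hermitian matrix (illustrative helper)."""
    g = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return g @ g.conj().T + 0.1 * np.eye(n)

def min_eig(h):
    """Smallest eigenvalue of the Hermitian part of h; >= 0 means h is positive."""
    return float(np.linalg.eigvalsh((h + h.conj().T) / 2)[0])

rng = np.random.default_rng(7)
x, y, p = random_positive(4, rng), random_positive(4, rng), 0.3
inv, eye = np.linalg.inv, np.eye(4)

w, v = np.linalg.eigh(x)
x_inv_half = (v * w ** -0.5) @ v.conj().T       # X^{-1/2}
y_tilde = x_inv_half @ y @ x_inv_half           # the replacement Y -> X^{-1/2} Y X^{-1/2}

# Eq. (13): right-hand side minus left-hand side should be positive.
gap13 = p * eye + (1 - p) * inv(y_tilde) - inv(p * eye + (1 - p) * y_tilde)

# Conjugating by X^{-1/2} turns Eq. (13) into the general convexity inequality for 1/x.
conjugated = x_inv_half @ gap13 @ x_inv_half
direct = p * inv(x) + (1 - p) * inv(y) - inv(p * x + (1 - p) * y)
print(min_eig(gap13), min_eig(direct), np.allclose(conjugated, direct))
```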

The operator convexity of $f(x) = -\ln(x)$ is now established using the integral representation
$-\ln(x) = \int_0^\infty dt \left( \frac{1}{x + t} - \frac{1}{1 + t} \right)$, from which it follows that for a strictly positive matrix
$X$ we have

$$-\ln(X) = \int_0^\infty dt \left( (X + tI)^{-1} - (I + tI)^{-1} \right). \tag{14}$$

Our goal is to show $-\ln(pX + (1 - p)Y) \le -p \ln(X) - (1 - p) \ln(Y)$. From Eq. (14), this
follows if we can prove $(pX + (1 - p)Y + tI)^{-1} \le p(X + tI)^{-1} + (1 - p)(Y + tI)^{-1}$ for all
$t \ge 0$, since the $(I + tI)^{-1}$ terms contribute equally to both sides. Rewriting
the left-hand side as $[p(X + tI) + (1 - p)(Y + tI)]^{-1}$ and applying the operator convexity of
$1/x$ gives the desired result, completing the proof of Lemma 1.

Moving to Lemma 2, note first a simple related result, namely, that when $U$ maps the
space $V$ onto $W$, then directly from the definition of $f(X)$ we obtain $f(U^\dagger X U) = U^\dagger f(X) U$.
This holds true regardless of whether $f$ is operator convex or not. Lemma 2 requires a
stronger hypothesis (the operator convexity of $f$), and gives rise to an inequality instead of an
equality, but has the advantage that it holds when the range $W'$ of $U$ is a strict subset of $W$.
Readers familiar with the operator Jensen inequality (see, e.g., [?]) may recognize Lemma 2
as a variant of this result.

To prove Lemma 2, let $P$ be the projector onto $W'$, and $Q = I - P$ the projector onto
the orthocomplement. As three separate vector spaces are involved, it is useful to introduce
the notations $f_V$, $f_W$ and $f_{W'}$ to denote the different ways $f$ can act, e.g., $f_V$ takes as input
a matrix acting on $V$, and produces as output a matrix acting on $V$, while $f_W$ takes as input
a matrix acting on $W$, and produces as output a matrix acting on $W$.

Note that $PU = U$, since $P$ projects onto the range of $U$. As a result we have
$f_V(U^\dagger X U) = f_V(U^\dagger P (PXP) P U)$. Note that $PU$ is an isometry from $V$ onto $W'$, and since
$PXP$ may be regarded as a matrix acting on $W'$, it follows that
$f_V(U^\dagger P (PXP) P U) = U^\dagger P f_{W'}(PXP) P U$. Summing up, we have shown that
$f_V(U^\dagger X U) = U^\dagger P f_{W'}(PXP) P U$. A little thought should convince you that to conclude the
proof it will suffice to show that $f_{W'}(PXP) \le P f_W(X) P$. Proving this inequality now becomes
our objective.

We observe that

$$f_{W'}(PXP) = P f_W(PXP) P = P f_W(PXP + QXQ) P, \tag{15}$$

since $f_W(PXP + QXQ) = f_W(PXP) + f_W(QXQ)$ and $P f_W(QXQ) P = 0$. Defining a
unitary $S = P - Q$ on $W$, and recalling that $P + Q = I$, we have

$$\frac{X + S X S^\dagger}{2} = \frac{(P + Q) X (P + Q) + (P - Q) X (P - Q)}{2} = PXP + QXQ$$

for arbitrary $X$. Applying the operator convexity of $f$ gives
$f_W(PXP + QXQ) \le (f_W(X) + f_W(S X S^\dagger))/2$, and since $f_W(S X S^\dagger) = S f_W(X) S^\dagger$ we obtain
$f_W(PXP + QXQ) \le (f_W(X) + S f_W(X) S^\dagger)/2 = P f_W(X) P + Q f_W(X) Q$. Conjugating by $P$
we obtain $P f_W(PXP + QXQ) P \le P f_W(X) P$. Combining this inequality with Eq. (15) gives
$f_{W'}(PXP) \le P f_W(X) P$, which, as noted above, is sufficient to establish Lemma 2.
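
The averaging identity and the resulting inequality can be illustrated numerically for $f(x) = -\ln(x)$. In the sketch below the dimensions ($\dim W = 5$, $\dim W' = 2$), the random subspace and the helper names are assumptions of this example.

```python
import numpy as np

def apply_fn(f, x):
    """f(X) = U f(D) U^dagger for Hermitian X."""
    d, u = np.linalg.eigh(x)
    return (u * f(d)) @ u.conj().T

def random_positive(n, rng):
    """Random strictly positive Hermitian matrix (illustrative helper)."""
    g = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return g @ g.conj().T + 0.1 * np.eye(n)

def min_eig(h):
    """Smallest eigenvalue of the Hermitian part of h; >= 0 means h is positive."""
    return float(np.linalg.eigvalsh((h + h.conj().T) / 2)[0])

def neg_log(t):
    return -np.log(t)

rng = np.random.default_rng(8)
n, k = 5, 2
x = random_positive(n, rng)
q_full, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
basis = q_full[:, :k]                  # orthonormal basis of W', playing the role of U
p = basis @ basis.conj().T             # projector P onto W'
q = np.eye(n) - p                      # projector Q onto the orthocomplement
s = p - q                              # the unitary S = P - Q

pinched = p @ x @ p + q @ x @ q
averaged = (x + s @ x @ s.conj().T) / 2          # equals PXP + QXQ

f_x = apply_fn(neg_log, x)
# Operator convexity: f(PXP + QXQ) <= P f(X) P + Q f(X) Q.
gap_pinch = p @ f_x @ p + q @ f_x @ q - apply_fn(neg_log, pinched)
# Restricted to W', this is Lemma 2 for the isometry `basis`.
gap_lemma2 = basis.conj().T @ f_x @ basis - apply_fn(neg_log, basis.conj().T @ x @ basis)
print(np.allclose(averaged, pinched), min_eig(gap_pinch), min_eig(gap_lemma2))
```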